Learning motions from demonstrations and rewards with time-invariant dynamical systems based policies
Abstract
An important challenge when using Reinforcement Learning for learning motions in robotics is the choice of parameterization for the policy. We use Gaussian Mixture Regression to extract a parameterization with relevant non-linear features from a set of demonstrations of a motion, following the paradigm of Learning from Demonstration. The resulting parameterization takes the form of a non-linear time-invariant dynamical system (DS). We use this time-invariant DS as a parameterized policy for a variant of the PI² policy search algorithm. This paper contributes by adapting PI² for our time-invariant motion representation. We introduce two novel parameter exploration schemes that can be used to 1) sample model parameters to achieve a uniform exploration in state space and 2) explore while ensuring stability of the resulting motion model. Additionally, a state-dependent stiffness profile is learned simultaneously with the reference trajectory, and both are used together in a variable impedance control architecture. This learning architecture is validated in a hardware experiment consisting of a digging task using a KUKA LWR platform.
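As a rough illustration of how Gaussian Mixture Regression can yield a time-invariant DS of the form ẋ = f(x), the sketch below conditions a joint Gaussian mixture over (x, ẋ) on the current state. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `gaussian_pdf` and `gmr_velocity`, the one-component toy model, and the Euler step size are all hypothetical, and neither the PI² update nor the stability-preserving exploration described in the abstract is shown.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal N(mean, cov) evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)


def gmr_velocity(x, priors, means, covs):
    """Conditional mean E[xdot | x] of a joint GMM over (x, xdot).

    Each mean is [mu_x, mu_xdot] and each covariance is the matching
    2d x 2d block matrix, where d is the state dimension.
    """
    d = x.shape[0]
    # Responsibility of each component for the current state x.
    weights = np.array([
        p * gaussian_pdf(x, m[:d], c[:d, :d])
        for p, m, c in zip(priors, means, covs)
    ])
    weights = weights / weights.sum()
    xdot = np.zeros(d)
    for h, m, c in zip(weights, means, covs):
        # Local linear map: Sigma_{xdot,x} @ inv(Sigma_{x,x}).
        A = c[d:, :d] @ np.linalg.inv(c[:d, :d])
        xdot += h * (m[d:] + A @ (x - m[:d]))
    return xdot


# Hypothetical one-component model in 2-D, chosen so that f(x) = -x.
priors = np.array([1.0])
means = np.array([[0.0, 0.0, 0.0, 0.0]])
covs = np.array([
    np.block([[np.eye(2), -np.eye(2)],
              [-np.eye(2), 2.0 * np.eye(2)]])
])

# Roll the state forward with explicit Euler steps (assumed dt = 0.05).
x = np.array([1.0, 2.0])
for _ in range(100):
    x = x + 0.05 * gmr_velocity(x, priors, means, covs)
```

Because the velocity depends only on the state and not on time, the same model can be queried from any starting point. In a policy-search setting the mixture parameters (priors, means, covariances) would be the quantities being perturbed.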
Related resources
Learning Partially Contracting Dynamical Systems from Demonstrations
An algorithm for learning the dynamics of point-to-point motions from demonstrations using an autonomous nonlinear dynamical system, named contracting dynamical system primitives (CDSP), is presented. The motion dynamics are approximated using a Gaussian mixture model (GMM) and its parameters are learned subject to constraints derived from partial contraction analysis. Systems learned using the...
Learning control Lyapunov function to ensure stability of dynamical system-based robot reaching motions
We consider an imitation learning approach to model robot point-to-point (also known as discrete or reaching) movements with a set of autonomous Dynamical Systems (DS). Each DS model codes a behavior (such as reaching for a cup and swinging a golf club) at the kinematic level. An estimate of these DS models is usually obtained from a set of demonstrations of the task. When modeling robot discr...
Learning robot motions with stable dynamical systems under diffeomorphic transformations
Accuracy and stability have in recent studies been emphasized as the two major ingredients to learn robot motions from demonstrations with dynamical systems. Several approaches yield stable dynamical systems but are also limited to specific dynamics that can potentially result in a poor reproduction performance. The current work addresses this accuracy-stability dilemma through a new diffeomorp...
Coupled dynamical system based arm-hand grasping model for learning fast adaptation strategies
Performing manipulation tasks interactively in real environments requires a high degree of accuracy and stability. At the same time, when one cannot assume a fully deterministic and static environment, one must endow the robot with the ability to react rapidly to sudden changes in the environment. These considerations make the task of reach and grasp difficult to deal with. We follow a programm...
Prediction of Above-elbow Motions in Amputees, based on Electromyographic (EMG) Signals, Using Nonlinear Autoregressive Exogenous (NARX) Model
Introduction: In order to improve the quality of life of amputees, biomechatronic researchers and biomedical engineers have been trying to use a combination of various techniques to provide suitable rehabilitation systems. Diverse biomedical signals, acquired from a specialized organ or cell system, e.g., the nervous system, are the driving force for the whole system. Electromyography (EMG), as a...
Journal: Auton. Robots
Volume: 42
Publication date: 2018